Adaptive Deep Multi-object Tracking Algorithm Fusing Crowd Density
LIU Jinwen1,2,3, REN Weihong4, TIAN Jiandong1,2
1. Robotics Laboratory, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110169; 2. Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169; 3. School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049; 4. School of Mechanical Engineering and Automation, Harbin Institute of Technology (Shenzhen), Shenzhen 518055
Abstract: Existing multi-object tracking methods perform poorly in scenes where objects are severely occluded. To address this problem, an adaptive deep multi-object tracking algorithm fusing crowd density is proposed. Firstly, crowd density maps are fused with object detection results, and the location and count information in the density maps is used to correct the detector output and eliminate missed and false detections. Then, an adaptive triplet loss is employed to improve the loss function of the re-identification model, which makes the re-identification features more discriminative. Finally, the tracking results are obtained by associating objects using both appearance and motion information. Experiments verify that the proposed algorithm effectively handles multi-object tracking in severely occluded scenes.
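The first step of the abstract, correcting detector output with the location and count information of a crowd density map, can be illustrated with a minimal sketch. This is not the authors' fusion procedure, which the abstract does not specify. It assumes that the density map integrates to the estimated number of people, that a detection box containing very little density mass is a likely false detection, and that a density count above the number of surviving boxes indicates missed detections. The names `box_mass` and `check_detections` and the threshold `min_mass` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixels, x2 and y2 exclusive.
Box = Tuple[int, int, int, int]


def box_mass(density: List[List[float]], box: Box) -> float:
    """Sum the density values inside a box, clipped to the map borders."""
    x1, y1, x2, y2 = box
    height, width = len(density), len(density[0])
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    return sum(sum(row[x1:x2]) for row in density[y1:y2])


def check_detections(
    density: List[List[float]], boxes: List[Box], min_mass: float = 0.5
) -> Tuple[List[Box], int]:
    """Drop boxes with little density support and estimate missed objects.

    min_mass is an assumed threshold, not a value from the paper.
    """
    # The integral of the density map is the estimated object count.
    estimated_count = sum(sum(row) for row in density)
    # Location cue: keep only boxes supported by enough density mass.
    kept = [b for b in boxes if box_mass(density, b) >= min_mass]
    # Count cue: objects the density map expects beyond the kept boxes.
    missing = max(0, round(estimated_count) - len(kept))
    return kept, missing


if __name__ == "__main__":
    # Toy 4x6 density map with two people, one unit of mass each.
    density = [[0.0] * 6 for _ in range(4)]
    density[1][1] = 1.0
    density[2][4] = 1.0
    # The first box covers a person; the second covers empty background.
    boxes = [(0, 0, 3, 3), (3, 3, 6, 4)]
    kept, missing = check_detections(density, boxes)
    print(kept, missing)
```

In this toy input the second box has no density support and is discarded, and the count cue reports one person that the detector missed. A real system would still need to localize that missed person from the density map, which this sketch does not attempt.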